(54) Title: SINGLE-CHAIN FORMS OF THE GLYCOPROTEIN HORMONE QUARTET
(57) Abstract 

Single-chain forms of the glycoprotein hormone quartet, at least some members of which are found in most vertebrates, are disclosed.
In one embodiment of these single-chain forms, the α and β subunits of the wild-type heterodimers or their variants are covalently linked,
optionally through a linker moiety. A drug may further be included within the linker moiety to be targeted to receptors for these hormones.
Some of the single-chain forms are agonists and others antagonists of the glycoprotein hormone activity. Another embodiment of the
single-chain compounds of the invention comprises two β subunits of the glycoprotein hormones, which β subunits are the same or different.
These "two-β" forms are antagonists of glycoprotein hormone activity.
SINGLE-CHAIN FORMS OF THE GLYCOPROTEIN HORMONE QUARTET



Acknowledgment of Government Support 

This invention was made with government support under
NIH Contract No. N01-HD-9-2922, awarded by the National
Institutes of Health. The government has certain rights in this
invention.

The invention relates to the field of protein
engineering and the glycoprotein hormones which occur normally
as heterodimers. More specifically, the invention concerns
single-chain forms of chorionic gonadotropin (CG), thyroid
stimulating hormone (TSH), luteinizing hormone (LH), and
follicle stimulating hormone (FSH).

Background Art

In humans, four important glycoprotein hormone
heterodimers (LH, FSH, TSH and CG) have identical α subunits and
differing β subunits. Three of these hormones are present in
virtually all other vertebrate species as well; CG has so far
been found only in primates and in horse placenta and urine.

PCT application WO90/09800, published 7 September
1990, and incorporated herein by reference, describes a number
of modified forms of these hormones. One important modification
is C-terminal extension of the β subunit by the carboxy terminal
peptide (CTP) of human chorionic gonadotropin or a variant thereof.
Other muteins of these hormones are also described. The
relevant positions for the CTP are from any one of positions
112-118 to position 145 of the β subunit of human chorionic
gonadotropin. The PCT application describes variants of the CTP
extension obtained by conservative amino acid substitutions such
that the capacity of the CTP to alter the clearance
characteristics is not destroyed. In addition, U.S. Serial No.
08/049,869 filed 20 April 1993, incorporated herein by
reference, describes modifying these hormones by extension or
insertion of the CTP at locations other than the C-terminus and
CTP fragments shorter than the sequence extending from positions
112-118 to 145.
The CTP-extended β subunit of FSH is also described in
two papers by applicants herein: LaPolt, P.S. et al.,
Endocrinology (1992) 131:2514-2520 and Fares, F.A. et al., Proc
Natl Acad Sci USA (1992) 89:4304-4308. Both of these papers are
incorporated herein by reference.

The crystal structure of the heterodimeric form of
human chorionic gonadotropin has now been published in more or
less contemporaneous articles: one by Lapthorn, A.J. et al.
Nature (1994) 369:455-461 and the other by Wu, H. et al.
Structure (1994) 2:545-558. The results of these articles are
summarized by Patel, D.J. Nature (1994) 369:438-439.

At least one instance of preparing a successful
single-chain form of a heterodimer is now known. The naturally
occurring sweetener protein, monellin, is isolated from
serendipity berries in a heterodimeric form. Studies on the
crystal structure of the heterodimer were consistent with the
proposition that the C-terminus of the B chain could be linked
to the N-terminus of the A chain through a linker which
preserved the spatial characteristics of the heterodimeric form.
Such a linkage is advantageous because a sweetener protein
is more useful in a form that is stable at high temperatures.
This was successfully
achieved by preparing the single-chain form, thus impeding heat
denaturation, as described in U.S. patent 5,264,558.

PCT application WO91/16922 published 14 November 1991
describes a multiplicity of chimeric and otherwise modified
forms of the heterodimeric glycoprotein hormones. In general,
the disclosure is focused on chimeras of α subunits or β
subunits involving portions of various α or β chains
respectively. One construct simply listed in this application,
and not otherwise described, fuses substantially all of the β
chain of human chorionic gonadotropin to the α subunit
preprotein, i.e., including the secretory signal sequence for
this subunit. This construct falls outside the scope of the
present invention since the presence of the signal sequence
intervening between the β and α chains fails to serve as a
linker moiety as defined and described herein.

It has now been found that the normally heterodimeric
glycoprotein hormones retain their properties when in
single-chain form, including single-chain forms that contain the
various CTP extensions and insertions described above.

Disclosure of the Invention

The invention provides single-chain forms of the
glycoprotein hormones, at least some of which hormones are found
in most vertebrate species. The single-chain forms of the
invention may be glycosylated, partially glycosylated, or
nonglycosylated and the α and β chains (or α and α, or β and β)
that occur in the native glycoprotein hormones or variants of
them may optionally be linked through a linker moiety.
Particularly preferred linker moieties include the carboxy
terminal peptide (CTP) unit either as a complete unit or only as
a portion thereof. The resulting single-chain hormones either
retain the activity of the unmodified heterodimeric form or are
antagonists of this activity.

Thus, in one aspect, the invention is directed to a
glycosylated or nonglycosylated protein which comprises the
amino acid sequence of the α subunit common to the glycoprotein
hormones linked covalently, optionally through a linker moiety,
to the amino acid sequence of the β subunit of one of said
hormones, or variants of said amino acid sequences wherein said
variants are defined herein.

In another aspect, the invention is directed to a
glycosylated or nonglycosylated protein which comprises the
amino acid sequence of the β subunit of a member of the
glycoprotein hormone quartet linked covalently, optionally
through a linker moiety, to the amino acid sequence of the β
subunit of one of said hormones, or variants of said amino acid
sequences wherein said variants are defined herein.

In another aspect, the invention is directed to a
glycosylated or nonglycosylated protein which comprises the
amino acid sequence of the α subunit of the glycoprotein hormone
quartet linked covalently, optionally through a linker moiety,
to the amino acid sequence of another α subunit, or variants of
said amino acid sequences wherein said variants are defined
herein.

In still another aspect, the invention is directed to
glycosylated or nonglycosylated single-chain forms of the
biologically important dimers whose efficacy is presaged by the
single-chain forms of the hormone quartet. Thus, the invention
is also directed to the single-chain forms of interleukins 3 and
12 (IL-3 and IL-12), tumor necrosis factor (TNF), transforming
growth factor (TGF) as well as inhibin. Also included are
hybrid interleukins such as single-chain forms of one subunit
from IL-3 and the other from IL-12.

In other aspects, the invention is directed to
recombinant materials and methods to produce the single-chain
proteins of the invention; to pharmaceutical compositions
containing them; to antibodies specific for them; and to methods
for their use.



Brief Description of the Drawings 

Figure 1 shows the construction of a SalI-bounded DNA
fragment fusing the third exon of CGβ with the second exon
encoding the α subunit.

Figure 2 shows the amino acid sequence and numbering
of positions 112-145 of human CGβ.

Figure 3 shows the results of a competition binding
assay for FSH receptor by various FSH analogs.

Figure 4 shows the results of a signal transduction
assay with respect to FSH receptor for various FSH analogs.

Modes of Carrying Out the Invention 

Four "glycoprotein" hormones in humans provide a
family which includes human chorionic gonadotropin (hCG),
follicle stimulating hormone (FSH), luteinizing hormone (LH),
and thyroid stimulating hormone (TSH). As used herein,
"glycoprotein hormones" refers to the members of this family,
whether found in humans or in other vertebrates. All of these
hormones are heterodimers comprised of α subunits which, for a
given species, are identical in amino acid sequence among the
group, and β subunits which differ according to the member of
the family. Thus, normally these glycoprotein hormones occur as
heterodimers composed of α and β subunits associated with each
other but not covalently linked. Most vertebrates produce FSH,
TSH and LH; chorionic gonadotropin has been found only in
primates, including humans, and horses. A specific form of CG
from horses has been designated pregnant mare serum glycoprotein
(PMSG).

Thus, this hormone "quartet" is composed of
heterodimers wherein the α and β subunits of each are encoded in
different genes and are separately synthesized by the host. The
host then assembles the separately synthesized subunits into a
non-covalently linked heterodimeric complex. In this manner,
the heterodimers of this hormone quartet differ from
heterodimers such as insulin, which is synthesized from a single
gene (in this case with an intervening "pro" sequence) and whose
subunits are covalently coupled using disulfide linkages. This
hormone quartet is also distinct from the immunoglobulins, which
are assembled from different loci but are covalently bound
through disulfide linkages. On the other hand, monellin, which
is, however, a plant protein, is held together through
noncovalent interaction between its A and B chains. It is not
known at present whether the two chains are encoded on separate
genes.

Thus, a variety of factors is influential in
determining the behavior of biologically active compounds which
are dimers formed from subunits that are identical or different.
The subunits may be covalently or noncovalently linked; they may
be synthesized by the same or different genes; and they may or
may not contain, in their precursor forms, a "pro" sequence
linking the two members of the dimer. Based on the results
obtained with the single-chain forms of the glycoprotein hormone
quartet herein, it is apparent that single-chain forms of the
biologically active dimers interleukin-12, interleukin-3 (IL-12
and IL-3), inhibin, tumor necrosis factor (TNF), and
transforming growth factor (TGF) will also be biologically
active.

The single-chain forms of the heterodimers or
homodimers have a number of advantages over their dimeric forms.
First, they are generally more stable. LH, in particular, is
noted for its instability and short half-life. Second, problems
of recombinant production are reduced since only a single gene
need be transcribed, translated and processed. This is
particularly important for expression in bacteria. Third, of
course, they provide an alternate form, thus permitting fine
tuning of activity levels and of in vivo half-lives. Finally,
single-chain forms are unique starting materials for identifying
truncated forms with the activity of the dimer. The linkage
between the subunits permits the protein to be engineered
without disturbing the overall folding of the protein.

Features of the Members of the Quartet

The β subunit of hCG is substantially larger than the
other β subunits in that it contains approximately 34 additional
amino acids at the C-terminus referred to herein as the carboxy
terminal portion (CTP) which, when glycosylated at the O-linked
sites, is considered responsible for the comparatively longer
serum half-life of hCG as compared to other gonadotropins
(Matzuk, et al., Endocrinol (1989) 126:376). In the native
hormone, this CTP extension contains four mucin-like O-linked
oligosaccharides.

In one embodiment of the present invention, the α and
β chains of the glycoprotein hormones are coupled into a
single-chain proteinaceous material where the α and β chains are
covalently linked, optionally through a linker moiety. The
linker moiety may include further amino acid sequence, and in
particular the CTP units described herein can be advantageously
included in the linker. In addition, the linker may include
peptide or nonpeptide drugs which can be targeted to the
receptors for the hormones.

In addition to the head-to-tail configuration that is
achievable by simply coupling the two peptide chains through a
peptide bond, the α and β chains can be linked head-to-head or
tail-to-tail. Head-to-head and tail-to-tail couplings involve
synthetic chemistry using standard techniques to link two
carboxyl or two amino groups through a linker moiety. For
example, two amino groups may be linked through an anhydride or
through any dicarboxylic acid derivative; two carboxyl groups
can be linked through diamines or diols using standard
activation techniques. However, the most preferred form is a
head-to-tail configuration wherein standard peptide linkages
suffice and the single-chain compound can be prepared as a
fusion protein recombinantly or using synthetic peptide
techniques, either in a single chain or, preferably, by ligating
individual portions of the entire sequence. Of course, if
desired, peptide or non-peptide linker moieties can be used in
this case as well, but this is unnecessary, and the convenience
of recombinant production of the single-chain protein would
suggest that embodiments that permit this method of production
comprise by far the most preferred approach.

When a head-to-tail configuration is employed, linkers
may consist essentially of additional peptide sequence. As is
the case with the heterodimers, the two β chains may be linked
through a CTP unit as further described below. Thus, possible
embodiments of the invention include, with the N-terminus at the
left, α-FSHβ, βFSH-α, α-βLH, α-CTP-βLH, βLH-CTP-α,
CTP-βLH-CTP-α; and the like.
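
The head-to-tail embodiments listed above are, in effect, ordered concatenations of named parts read from the N-terminus. A minimal sketch of that bookkeeping follows. It assumes short placeholder strings in place of the real subunit sequences, and the names `PARTS`, `assemble` and `label` are hypothetical helpers introduced only for illustration.

```python
# Placeholder residues only; these are NOT the real subunit sequences.
PARTS = {
    "alpha": "AAAA",    # common alpha subunit (placeholder)
    "FSHbeta": "FFFF",  # FSH beta subunit (placeholder)
    "LHbeta": "LLLL",   # LH beta subunit (placeholder)
    "CTP": "CC",        # CTP unit used as linker or extension (placeholder)
}


def assemble(order):
    """Join the named parts head to tail, N-terminus first."""
    return "".join(PARTS[name] for name in order)


def label(order):
    """Render a construct name in left-to-right (N to C) notation."""
    return "-".join(order)


construct = ["CTP", "LHbeta", "CTP", "alpha"]
print(label(construct), assemble(construct))
```

Representing a construct as an ordered list keeps the orientation explicit, so that, for example, α-FSHβ and βFSH-α remain distinct entries.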

The single-chain forms of the heterodimeric
gonadotropins or glycoprotein quartet also relate to additional
important sets of embodiments wherein, rather than coupling the α
and β subunits, two β or two α subunits may be coupled together
to form a single-chain compound. As with the heterodimer, the
coupling can be head to head, head to tail, or tail to tail.

The "two-β" single-chain tandem peptides are
especially useful as antagonists for the receptors normally
activated by the heterodimeric glycoprotein hormones. Since the
α subunit is believed largely responsible for signal
transduction, while the β subunit confers receptor specificity, and
since the α and β subunits have similar conformations, the
single-chain compounds should be able specifically to bind a
receptor for which at least one β chain is present without
activating the receptor.

The antagonist activity of the "2-β" single-chain
tandem peptides is based in part on the crystal structure of the
heterodimers. It is noted that the α and β chains have similar
cystine-knot configurations and that some of the folding
patterns of the two chains are analogous.

The "two-β" single-chain compounds of the invention
may be designed to contain tandem copies of the same β
subunit, i.e., FSHβ-FSHβ; HCGβ-HCGβ; TSHβ-TSHβ; or LHβ-LHβ; or
chimeric single-chain compounds may be employed such as
HCGβ-FSHβ; FSHβ-LHβ; LHβ-TSHβ and the like. There are a total of 12
such possible combinations. In addition, the carboxyl terminal
peptide (CTP) of the HCG β subunit improves the conformation of
the single-chain compound when present between the two β chains.
This is automatic when HCG β is the upstream portion; however, in
other instances, it is convenient to employ a CTP subunit as
described herein at the carboxyl terminus of the upstream
participant. Two such CTP units are also included within the
invention scope. Thus, preferred embodiments include
FSHβ-CTP-FSHβ; FSHβ-CTP-CTP-FSHβ; LHβ-CTP-FSHβ; LHβ-CTP-CTP-FSHβ and the
like.
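
The count of 12 chimeric combinations follows from choosing an upstream and a different downstream β subunit from the four members: 4 x 3 = 12 ordered pairs, in addition to the 4 tandem pairs of identical subunits. The sketch below enumerates them and names constructs with optional CTP units between the chains. The labels in `BETAS` (with "b" standing for β) and the helper `linked` are illustrative assumptions and carry no sequence information.

```python
from itertools import permutations

# "b" stands for the beta subunit; these are labels, not sequences.
BETAS = ("FSHb", "hCGb", "TSHb", "LHb")

# Tandem copies of the same beta subunit.
tandem = [(b, b) for b in BETAS]

# Ordered pairs of two different beta subunits (upstream, downstream).
chimeric = list(permutations(BETAS, 2))


def linked(upstream, downstream, n_ctp=0):
    """Name a two-beta construct with n_ctp explicit CTP units in between.

    Note: hCG beta already ends in its native CTP, so an explicit unit
    is mainly relevant when another beta subunit is upstream.
    """
    return "-".join([upstream] + ["CTP"] * n_ctp + [downstream])


print(len(tandem), len(chimeric))
print(linked("FSHb", "FSHb", 2))
```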

Similar descriptions apply to the "two-α" single-chain
compounds, except, of course, that chimeric pairs are not
included other than with respect to α variants. Various
linkers, preferably CTP-based, and CTP extensions are also
included.

The following definitions may be helpful in describing
the single-chain forms of the molecules.

As used herein, α subunit, and FSH, LH, TSH, and CG β
subunits as well as the heterodimeric forms have in general
their conventional definitions and refer to the proteins having
the amino acid sequences known in the art per se, or allelic
variants thereof, regardless of the glycosylation pattern
exhibited.

"Native" forms of these peptides are those which have
the amino acid sequences isolated from the relevant vertebrate
tissue, and have these known sequences per se, or their allelic
variants.

"Variant" forms of these proteins are those which have
deliberate alterations in amino acid sequence of the native
protein produced by, for example, site-specific mutagenesis or
by other recombinant manipulations, or which are prepared
synthetically.

These alterations consist of 1-10, preferably 1-8, and
more preferably 1-5 amino acid changes, including deletions,
insertions, and substitutions, most preferably conservative
amino acid substitutions as defined below. The resulting
variants must retain activity which affects the corresponding
activity of the native hormone -- i.e., either they must retain
the biological activity of the native hormone directly, or they
must behave as antagonists, generally by virtue of being able to
bind the receptors for the native hormones but lacking the
ability to effect signal transduction. For example, it is known
that if the glycosylation site at position 52 of the α subunit
is removed by an amino acid substitution, thereby preventing
all glycosylation at that site, the hormones which are
heterodimers with this altered α subunit are generally antagonists
and are able to bind receptors, preventing the native hormone
from doing so in competition. (On the other hand, the
glycosylation site of the α subunit at position 78 appears not
greatly to affect the activity of the hormones.) Other
alterations in the amino acid sequence may also result in
antagonist rather than agonist activity for the variant.

One set of preferred variants are those wherein the
glycosylation sites of either the α or β subunits or both have
been altered. The α subunit contains two glycosylation sites,
one at position 52 and the other at position 78, and the effect
of alterations of these sites on activity has just been
described. Similarly, the β subunits generally contain two
N-linked glycosylation sites (at positions that vary somewhat
with the nature of the chain) and similar alterations can be
made at these sites. The CTP extension of hCG contains four
O-linked glycosylation sites, and conservative mutations at the
serine residues (e.g., conversion of the serine to alanine)
destroy these sites. Destruction of the O-linked glycosylation
sites may effect conversion of agonist activity to antagonist
activity.

Finally, alterations in amino acid sequence that are
proximal to the N-linked or O-linked glycosylation sites
influence the nature of the glycosylation that is present on the
resulting molecule and also alter activity.

Alterations in amino acid sequence also include both
insertions and deletions. Thus, truncated forms of the hormones
are included among variants, e.g., mutants of the α subunit
which are lacking some or all of the amino acids at positions
85-92 at the C-terminus. In addition, α subunits with 1-10
amino acids deleted from the N-terminus are included. Some
useful variants of the hormone quartet described herein are set
forth in U.S. Patent 5,177,193 issued 5 January 1993 and
incorporated herein by reference. As shown therein, the
glycosylation patterns can be altered by destroying the relevant
sites or, in the alternative, by choice of host cell in which
the protein is produced.

As explained above, the single-chain forms are
convenient starting materials for various engineered muteins.
Such muteins include those with non-critical regions altered or
removed. Such deletions and alterations may comprise entire
loops, so that sequences of considerably more than 10 amino
acids may be deleted or changed. The single-chain molecules
must, however, retain at least the receptor binding domains
and/or the regions involved in signal transduction.

There is considerable literature on variants of the
hormone quartet described herein and it is clear from this
literature that a large number of possible variants which result
both in agonist and antagonist activity can be prepared. Such
variants are disclosed, for example, in Chen, F. et al. Molec
Endocrinol (1992) 6:914-919; Yoo, J. et al. J Biol Chem (1993)
268:13034-13042; Yoo, J. et al. J Biol Chem (1991) 266:17741-
17743; Puett, D. et al. Glycoprotein Hormones, Lustbader, J.W. et
al., eds., Springer-Verlag, New York (1994) 122-134; Keutmann, H.T.
et al. (ibid) pages 103-117; Erickson, L.D. et al. Endocrinology
(1990) 126:2555-2560; and Bielinska, M. et al. J Cell Biol
(1990) 111:330a (Abstract 1844).

As described hereinabove, one method of constructing
effective antagonists is to prepare a single-chain molecule
containing two β subunits of the same or different member of the
glycoprotein quartet. Particularly preferred variants of these
single-chain forms include those wherein one or more cystine
links are deleted, typically by substituting a neutral amino acid
for one or both cysteines which participate in the link.
Particularly preferred cystine links which may be deleted are
those between positions 26 and 110 and between positions 23 and
72.

In addition, it has been demonstrated that the β
subunits of the hormone quartet can be constructed in chimeric
forms so as to provide biological functions of both components
of the chimera, or, in general, hormones of altered biological
function. Thus, chimeric molecules which exhibit both FSH and
LH/CG activities can be constructed as described by Moyle, Proc
Natl Acad Sci (1991) 88:760-764; Moyle, Nature (1994) 368:251-
255. As disclosed in these papers, substituting amino acids
101-109 of FSH-β for the corresponding residues in the CG-β
subunit yields an analog with both hCG and FSH activity.

These chimeric forms of β subunits can also be used in
the single-chain compounds which couple two β subunits into a
single molecule.

Although it is recognized that glycosylation pattern
has a profound influence on activity both qualitatively and
quantitatively, for convenience the terms FSH, LH, TSH, and CG β
subunits refer to the amino acid sequence characteristic of the
peptides, as does "α subunit." When only the β chain is
referred to, the terms will be, for example, FSHβ; when the
heterodimer is referred to, the simple term "FSH" will be used.
It will be clear from the context in what manner the
glycosylation pattern is affected by, for example, recombinant
expression host or alteration in the glycosylation sites. Forms
of the glycoprotein with specified glycosylation patterns will
be so noted.

As used herein "peptide" and "protein" are used
interchangeably, since the length distinction between them is
arbitrary.

In the single-chain forms of the present invention,
the α and/or β chain may contain a CTP extension inserted into a
noncritical region.

"Noncritical" regions of the α and β subunits are
those regions of the molecules not required for biological
activity (including agonist and antagonist activity). In
general, these regions are removed from binding sites, precursor
cleavage sites, and catalytic regions. Regions critical for
inducing proper folding, binding to receptors, catalytic
activity and the like should be avoided; similarly, regions
which are critical to assure the three-dimensional conformation
of the protein should be avoided. It should be noted that some
of the regions which are critical in the case of the dimer
become non-critical in the single-chain forms since the
conformational restriction imposed by the single chain may
obviate the necessity for these regions. The ascertainment of
noncritical regions is readily accomplished by deleting or



WO 96/05224 PCT/US95/09664


modifying candidate regions and conducting an appropriate assay
for the desired activity. Regions where modifications result in
loss of activity are critical; regions wherein the alteration
results in the same or similar activity (including antagonist
activity) are considered noncritical.

It should be emphasized that by "biological activity"
is meant activity which is either agonistic or antagonistic to
that of the native hormones. Thus, certain regions are critical
for behavior of a variant as an antagonist, even though the
antagonist is unable to directly provide the physiological
effect of the hormone.

For example, for the α subunit, positions 33-59 are
thought to be necessary for signal transduction and the 20 amino
acid stretch at the carboxy terminus is needed for signal
transduction/receptor binding. Residues critical for assembly
with the β subunit include at least residues 33-58, particularly
37-40.

Where the noncritical region is "proximal" to the N-
or C-terminus, the insertion is at any location within 10 amino
acids of the terminus, preferably within 5 amino acids, and most
preferably at the terminus per se.

In general, "proximal" is used to indicate a position
which is within 10 amino acids, preferably within five amino
acids, of a referent position, and most preferably at the
referent position per se. Thus, certain variants may contain
substitutions of amino acids "proximal" to a glycosylation site;
the definition is relevant here. In addition, the α and β
subunits may be linked to each other at positions "proximal" to
their N- or C-termini.
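
The three tiers of "proximal" given above can be expressed as a simple distance test. The sketch below assumes that "within 10" and "within five" are inclusive bounds counted in residue positions; the text does not state this explicitly, and the function name `proximity_tier` is a hypothetical helper.

```python
def proximity_tier(position, referent):
    """Classify a residue position relative to a referent position.

    Assumption: the 10- and 5-residue limits are treated as inclusive.
    """
    distance = abs(position - referent)
    if distance == 0:
        return "at referent (most preferred)"
    if distance <= 5:
        return "within 5 (preferred)"
    if distance <= 10:
        return "within 10 (proximal)"
    return "not proximal"


# Example: positions near the glycosylation site at position 52
# of the alpha subunit mentioned earlier in the text.
for pos in (52, 49, 60, 70):
    print(pos, proximity_tier(pos, 52))
```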

As used herein, the "CTP unit" refers to an amino acid
sequence found at the carboxy terminus of human chorionic
gonadotropin β subunit which extends from amino acid 112-118 to
residue 145 at the C-terminus or to a portion thereof. Thus,
each "complete" CTP unit contains 28-34 amino acids, depending
on the N-terminus of the CTP. The native sequence of positions
112-145 is shown in Figure 2.
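
The range of 28-34 amino acids follows directly from the inclusive position numbering: a unit starting at position s and ending at position 145 contains 145 - s + 1 residues, so s = 112 gives 34 and s = 118 gives 28. A minimal sketch of that arithmetic, with `complete_ctp_length` as a hypothetical helper name:

```python
CTP_END = 145                  # C-terminal residue of the hCG beta subunit
CTP_START_RANGE = (112, 118)   # permitted N-terminal positions of a complete unit


def complete_ctp_length(start):
    """Number of residues in a complete CTP unit starting at `start`."""
    low, high = CTP_START_RANGE
    if not low <= start <= high:
        raise ValueError("a complete CTP unit starts at positions 112-118")
    return CTP_END - start + 1  # inclusive count


lengths = [complete_ctp_length(s) for s in range(112, 119)]
print(lengths)
```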

By a "partial" CTP unit is meant an amino acid
sequence which occurs between positions 112-118 to 145
inclusive, but which has at least one amino acid deleted from
the shortest possible "complete" CTP unit (i.e., from positions
118-145). The "partial" CTP units included in the invention
preferably contain at least one O-glycosylation site if agonist
activity is desired. Some nonglycosylated forms of the hormones
are antagonists and are useful as such. The CTP unit contains
four such sites at the serine residues at positions 121 (site
1); 127 (site 2); 132 (site 3); and 138 (site 4). The partial
forms of CTP useful in agonists of the invention will contain
one or more of these sites arranged in the order in which they
appear in the native CTP sequence. Thus, the "partial" CTP unit
employed in agonists of the invention may include all four
glycosylation sites; sites 1, 2 and 3; sites 1, 2 and 4; sites
1, 3 and 4; sites 2, 3 and 4; or simply sites 1 and 2; 1 and 3;
1 and 4; 2 and 3; 2 and 4; or 3 and 4; or may contain only one
of sites 1, 2, 3 or 4.
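
The site combinations enumerated above are exactly the non-empty subsets of the four O-glycosylation sites, kept in native order: 1 + 4 + 6 + 4 = 15 in total. The sketch below reproduces that enumeration; the names `SITES` and `site_subsets` are illustrative and not part of the disclosure.

```python
from itertools import combinations

# Site number -> serine position in the hCG beta subunit, as given in the text.
SITES = {1: 121, 2: 127, 3: 132, 4: 138}


def site_subsets():
    """All non-empty subsets of the four sites, largest first, in native order."""
    numbers = sorted(SITES)
    return [
        combo
        for size in range(len(numbers), 0, -1)
        for combo in combinations(numbers, size)
    ]


for combo in site_subsets():
    positions = [SITES[n] for n in combo]
    print("sites", combo, "-> serines at", positions)
```

Because `combinations` draws from a sorted sequence, each subset lists its sites in ascending order, which mirrors the requirement that retained sites appear in the order of the native CTP sequence.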

By "tandem" inserts or extensions is meant that the
insert or extension contains at least two "CTP units". Each CTP
unit may be complete or a fragment, and native or a variant.
All of the CTP units in the tandem extension or insert may be
identical, or they may be different from each other. Thus, for
example, the tandem extension or insert may generically be
partial-complete; partial-partial; partial-complete-partial;
complete-complete-partial, and the like wherein each of the
noted partial or complete CTP units may independently be either
a variant or the native sequence.

The "linker moiety" is a moiety that joins the α and β
sequences without interfering with the activity that would
otherwise be exhibited by the same α and β chains as members of
a heterodimer, or which alters that activity to convert it from
agonist to antagonist activity. The level of activity may
change within a reasonable range, but the presence of the linker
cannot be such as to deprive the single-chain form of both
substantial agonist and substantial antagonist activity. The
single-chain form must remain as a single-chain form when it is
recovered from its production medium and must exhibit activity
pertinent to the hormonal activity of the heterodimer, the
elements of which form its components.
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The hormone sxibunits and the CTP units may correspond 
exactly to the native hormone or CTP sequence, or may be 
variants. The nature of the variants has been defined 
5 hereinabove. In such variants, 1-10, preferably 1-8, and most 
preferably 1-5 of the amino acids contained in the native 
sequence are substituted by a different aucnino acid compared to 
the native amino acid at that position, or 1-10, more preferably 
1-8 and most preferably 1-5 amino acids are simply deleted or 

10 combination of these. As pointed out above, when non- critical 
regions of the single chain forms are identified, in particular, 
through detecting the presence of non-critical "loops", the 
number of amino acids altered by deletion or substitution may be 
increased to 20 or 3 0 or any arbitrary number depending on the 

15 length of amino acid sequence in the relevant non- critical 

region. Of course, deletion or substitutions in more than one 
non- critical region results in still greater numbers of amino 
acids in the single chain forms being affected and substitution 
and deletions strategies may be used in combination. The 

20 substitutions or deletions taken cumulatively do not result in 
substantial elimination of agonist or antagonist activity 
associated with the hormone. Substitutions by conservative 
analogs of the native amino acid are preferred. 
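
The numerical limits above can be checked mechanically once a variant has been aligned to its native sequence. The following is a minimal sketch only: the use of "-" to mark a deleted residue, the function names, and the reading that substitutions and deletions count toward one combined limit are all assumptions made for illustration, and the sequences are hypothetical.

```python
def count_changes(native: str, variant: str) -> tuple[int, int]:
    """Count substitutions and deletions between two pre-aligned sequences.

    Both strings must have equal length; '-' in `variant` marks a deleted
    residue (an alignment convention assumed here, not taken from the text).
    """
    if len(native) != len(variant):
        raise ValueError("sequences must be aligned to equal length")
    substitutions = sum(
        1 for n, v in zip(native, variant) if v != "-" and v != n
    )
    deletions = variant.count("-")
    return substitutions, deletions


def within_variant_limit(native: str, variant: str, max_changes: int = 10) -> bool:
    """True if the variant carries between 1 and `max_changes` alterations.

    Assumption: substitutions and deletions share one combined limit.
    """
    substitutions, deletions = count_changes(native, variant)
    return 1 <= substitutions + deletions <= max_changes


# Hypothetical 8-residue fragment: one substitution (F -> Y), one deletion (D).
print(count_changes("ACDEFGHI", "AC-EYGHI"))         # (1, 1)
print(within_variant_limit("ACDEFGHI", "AC-EYGHI"))  # True
```

Passing max_changes=8 or max_changes=5 gives the "preferably" and "most preferably" limits; the larger allowances for non-critical loops would need a region-specific limit that this sketch does not model.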

"Conservative analog" means, in the conventional 

sense, an analog wherein the residue substituted is of the same
general amino acid category as that for which substitution is
made. Amino acids have been classified into such groups, as is
understood in the art, by, for example, Dayhoff, M. et al.,
Atlas of Protein Sequences and Structure (1972) 5:89-99. In
general, acidic amino acids fall into one group; basic amino
acids into another; neutral hydrophilic amino acids into
another; and so forth.

More specifically, amino acid residues can be
generally subclassified into four major subclasses as follows:

Acidic: The residue has a negative charge due to loss
of H ion at physiological pH and the residue is attracted by
aqueous solution so as to seek the surface positions in the
conformation of a peptide in which it is contained when the
peptide is in aqueous medium at physiological pH.

Basic: The residue has a positive charge due to
association with H ion at physiological pH and the residue is
attracted by aqueous solution so as to seek the surface
positions in the conformation of a peptide in which it is
contained when the peptide is in aqueous medium at physiological
pH.

Neutral/nonpolar: The residues are not charged at
physiological pH and the residue is repelled by aqueous solution
so as to seek the inner positions in the conformation of a
peptide in which it is contained when the peptide is in aqueous
medium. These residues are also designated "hydrophobic"
herein.

Neutral/polar: The residues are not charged at
physiological pH, but the residue is attracted by aqueous
solution so as to seek the outer positions in the conformation
of a peptide in which it is contained when the peptide is in
aqueous medium.

It is understood, of course, that in a statistical
collection of individual residue molecules some molecules will
be charged, and some not, and there will be an attraction for or
repulsion from an aqueous medium to a greater or lesser extent.
To fit the definition of "charged," a significant percentage (at
least approximately 25%) of the individual molecules are charged
at physiological pH. The degree of attraction or repulsion
required for classification as polar or nonpolar is arbitrary
and, therefore, amino acids specifically contemplated by the
invention have been classified as one or the other. Most amino
acids not specifically named can be classified on the basis of
known behavior.

Amino acid residues can be further subclassified as
cyclic or noncyclic, and aromatic or nonaromatic, self-
explanatory classifications with respect to the side chain
substituent groups of the residues, and as small or large. The
residue is considered small if it contains a total of 4 carbon
atoms or less, inclusive of the carboxyl carbon. Small residues
are, of course, always nonaromatic.

For the naturally occurring protein amino acids,
subclassification according to the foregoing scheme is as
follows.

Acidic: Aspartic acid and Glutamic acid;

Basic/noncyclic: Arginine, Lysine;

Basic/cyclic: Histidine;

Neutral/polar/small: Glycine, Serine, Cysteine;

Neutral/nonpolar/small: Alanine;

Neutral/polar/large/nonaromatic: Threonine,
Asparagine, Glutamine;

Neutral/polar/large/aromatic: Tyrosine;

Neutral/nonpolar/large/nonaromatic: Valine,
Isoleucine, Leucine, Methionine;

Neutral/nonpolar/large/aromatic: Phenylalanine, and
Tryptophan.

The gene-encoded secondary amino acid proline,
although technically within the group neutral/nonpolar/
large/cyclic and nonaromatic, is a special case due to its known
effects on the secondary conformation of peptide chains, and is
not, therefore, included in this defined group.
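
The classification above lends itself to a simple lookup table. The sketch below transcribes the nine classes for the gene-encoded residues and tests whether a proposed substitution stays within a class. Treating "same general amino acid category" as "identical full class label" is an assumption made for illustration, as are the names CLASSES, RESIDUE_CLASS and is_conservative; a looser reading (for example, any two acidic or any two neutral/nonpolar residues) would also be consistent with the text.

```python
# Classes transcribed from the subclassification list above.
# Proline is intentionally absent: the text treats it as a special case.
CLASSES = {
    "acidic": ["Asp", "Glu"],
    "basic/noncyclic": ["Arg", "Lys"],
    "basic/cyclic": ["His"],
    "neutral/polar/small": ["Gly", "Ser", "Cys"],
    "neutral/nonpolar/small": ["Ala"],
    "neutral/polar/large/nonaromatic": ["Thr", "Asn", "Gln"],
    "neutral/polar/large/aromatic": ["Tyr"],
    "neutral/nonpolar/large/nonaromatic": ["Val", "Ile", "Leu", "Met"],
    "neutral/nonpolar/large/aromatic": ["Phe", "Trp"],
}

# Invert the table: three-letter residue code -> class label.
RESIDUE_CLASS = {
    residue: label for label, members in CLASSES.items() for residue in members
}


def is_conservative(native: str, substitute: str) -> bool:
    """True if both residues carry the same class label (assumed criterion)."""
    native_label = RESIDUE_CLASS.get(native)
    return native_label is not None and native_label == RESIDUE_CLASS.get(substitute)


print(is_conservative("Leu", "Ile"))  # True: both neutral/nonpolar/large/nonaromatic
print(is_conservative("Lys", "Glu"))  # False: basic versus acidic
print(is_conservative("Pro", "Ala"))  # False: proline is outside the scheme
```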

If the single-chain proteins of the invention are
constructed by recombinant methods, they will contain only
gene-encoded amino acid substitutions; however, if any portion is
synthesized by standard, for example, solid phase, peptide
synthesis methods and ligated, for example, enzymatically, into
the remaining protein, non-gene-encoded amino acids, such as
aminoisobutyric acid (Aib), phenylglycine (Phg), and the like
can also be substituted for their analogous counterparts.

These non-encoded amino acids also include, for
example, β-alanine (β-Ala), or other omega-amino acids, such as
3-aminopropionic, 4-aminobutyric and so forth, sarcosine
(Sar), ornithine (Orn), citrulline (Cit), t-butylalanine
(t-BuA), t-butylglycine (t-BuG), N-methylisoleucine (N-MeIle),
and cyclohexylalanine (Cha), norleucine (Nle), cysteic acid
(Cya); 2-naphthylalanine (2-Nal); 1,2,3,4-tetrahydroisoquinoline-
3-carboxylic acid (Tic); mercaptovaleric acid (Mvl);
β-2-thienylalanine (Thi); and methionine sulfoxide (MSO). These
also fall conveniently into particular categories.
Based on the above definitions,

Sar, β-Ala and Aib are neutral/nonpolar/small;

t-BuA, t-BuG, N-MeIle, Nle, Mvl and Cha are
neutral/nonpolar/large/nonaromatic;

Orn is basic/noncyclic;

Cya is acidic;

Cit, Acetyl Lys, and MSO are neutral/polar/
large/nonaromatic; and

Phg, Nal, Thi and Tic are neutral/nonpolar/large/
aromatic.

The various omega-amino acids are classified according
to size as neutral/nonpolar/small (β-Ala, i.e.,
3-aminopropionic, 4-aminobutyric) or large (all others).

Thus, amino acid substitutions other than those
encoded in the gene can also be included in peptide compounds
within the scope of the invention and can be classified within
this general scheme according to their structure.

Preferred Embodiments of the Single-Chain Hormones

The single-chain hormones of the invention are most
efficiently and economically produced using recombinant
techniques. Therefore, those forms of α and β chains, CTP units
and other linker moieties which include only gene-encoded amino
acids are preferred. It is possible, however, as set forth
above, to construct at least portions of the single-chain
hormones using synthetic peptide techniques or other organic
synthesis techniques and therefore variants which contain
non-gene-encoded amino acids are also within the scope of the
invention.

In the most preferred embodiments of the single-chain
hormones of the invention, the C-terminus of the β subunit is
covalently linked, optionally through a linker, to the
N-terminus of the mature α subunit; forms wherein the C-terminus
of the α subunit is linked to the N-terminus of the β subunit
are also useful, but may have less activity either as
antagonists or agonists of the relevant receptor. The linkage
can be a direct peptide linkage wherein the C-terminal amino
acid of one subunit is directly linked through the peptide bond
to the N-terminus of the other; however, in many instances it is
preferable to include a linker moiety between the two termini.
In many instances, the linker moiety will provide at least one
turn between the two chains. The presence of proline residues
in the linker may therefore be advantageous.

As described above, the N-terminus of the α chain may
also be coupled to the N-terminus of the β chain or the
C-terminus of the α to the C-terminus of the β chain, in any case
through a linker unit. Similar combinations are included in the
single-chain forms comprising two α or two β subunits.

It should be understood that in discussing linkages 
between the termini of the subunits comprising the single chain 
forms, one or more termini may be altered by substitution and/or 
deletion as described above. 

Preferred embodiments of the single-chain compounds
containing two β subunits are those wherein the C-terminus of
one unit is linked to the N-terminus of the other, optionally
through a linker, preferably a peptide linker. Also possible,
as described above, are linkages between the N-termini of the
two β chains or linkages between the C-termini of the two β
chains; in these cases, of course, a linker is required.

While the head-to-head, tail-to-tail and head-to-tail
configurations of both the single-chain heterodimer and the
single-chain two-β subunit form have been described, the linkage
between the two subunits may also occur at positions not
precisely at the N- or C-terminus of each member but at
positions proximal thereto.

In one particularly preferred set of embodiments, the
linkage is head-to-tail and the linker moiety will include one
or more CTP units and/or variants or truncated forms thereof.
Preferred forms of the CTP units used in such linker moieties
are described hereinbelow.

Further, the linker moiety may include a drug
covalently, preferably releasably, bound to the linker moiety.
Means for coupling the drug to the linker moiety and for
providing for its release are conventional.

In addition to their occurrence in the linker moiety,
CTP and its variants and truncations may also be included in any
noncritical region of the subunits making up the single-chain
hormone. The nature of these inclusions, and their positions,
is set forth in detail in the parent application herein.

While CTP units are preferred inclusions in the linker
moiety, it is understood that the linker may be any suitable
covalently bound material which provides the appropriate spatial
relationship between the α and β subunits. Thus, for
head-to-tail configurations the linker may generally be a peptide
comprising an arbitrary number, but typically less than 100,
more preferably less than 50 amino acids which has the proper
hydrophilicity/hydrophobicity ratio to provide the appropriate
spacing and conformation in solution. In general, the linker
should be on balance hydrophilic so as to reside in the
surrounding solution and out of the way of the interaction
between the α and β subunits or the two β subunits. It is
preferable that the linker include β turns typically provided by
proline residues. Any suitable polymer, including peptide
linkers, with the above-described correct characteristics may be
used.
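
The stated preferences for a peptide linker (fewer than 100 residues, more preferably fewer than 50, hydrophilic on balance, proline present) can be screened with a few lines of code. This is a rough sketch under stated assumptions: "hydrophilic on balance" is approximated as more than half of the residues falling outside the nonpolar classes of the scheme given earlier, proline is counted as nonpolar, and both the function name check_linker and the test sequence are hypothetical. It says nothing about whether a linker actually gives the right spacing or conformation.

```python
# One-letter codes for residues the classification scheme labels nonpolar.
# Proline is counted as nonpolar here ("technically" so, per the text).
NONPOLAR = set("AVILMFWP")


def check_linker(seq: str) -> dict:
    """Report which of the stated linker preferences a peptide satisfies."""
    seq = seq.upper()
    n = len(seq)
    hydrophilic = sum(1 for residue in seq if residue not in NONPOLAR)
    return {
        "under_100": n < 100,
        "under_50": n < 50,
        # Assumed reading of "on balance hydrophilic": a simple majority.
        "hydrophilic_on_balance": n > 0 and hydrophilic > n / 2,
        "has_proline": "P" in seq,
    }


# Hypothetical 10-residue linker, not a sequence from the specification.
print(check_linker("GSGSPGSGSP"))
```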

One particular linker moiety that is not included
within the scope of the invention is that which includes a
signal peptide immediately upstream of the downstream subunit.

Particularly preferred embodiments of the single-chain
hormones of the invention include:

βFSH-α; βFSH-βFSH; βFSH-βLH;
βLH-α; βLH-βLH; βLH-βFSH;
βTSH-α; βTSH-βTSH; βTSH-βFSH;
βCG-α; βCG-βCG; βCG-βFSH; βCG-βTSH;
βFSH-CTP-α; βFSH-CTP-βFSH; βFSH-CTP-βLH;
βLH-CTP-α; βLH-CTP-βLH; βLH-CTP-βFSH;
βCG-CTP-α; βFSH-CTP-CTP-βLH;
βFSH-CTP-CTP-α; βFSH-CTP-CTP-βTSH;
βLH-CTP-CTP-α; βLH-CTP-CTP-βLH;
βCG-CTP-CTP-α; βCG-CTP-CTP-βLH;
α-α; α-CTP-α; and α-CTP-CTP-α

and the like. Also particularly preferred are the human forms
of the subunits. In the above constructions, "CTP" refers to
CTP or its variants or truncations as further explained in the
paragraph below.
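
The hyphenated names in the list above follow a regular pattern: a subunit, zero or more CTP units, and a second subunit. A small parser makes that pattern explicit. This is an illustrative sketch only; the function name describe_construct and the returned field names are hypothetical, and reading each name left to right as upstream to downstream is an assumption based on the head-to-tail linkage described earlier.

```python
def describe_construct(name: str) -> dict:
    """Split a name such as 'βFSH-CTP-CTP-α' into its components."""
    parts = name.split("-")
    subunits = [p for p in parts if p != "CTP"]
    # Expect exactly two subunits, with any CTP units between them.
    if len(subunits) != 2 or parts[0] == "CTP" or parts[-1] == "CTP":
        raise ValueError(f"expected subunit-[CTP-...]-subunit, got {name!r}")
    kinds = []
    for subunit in subunits:
        if subunit == "α":
            kinds.append("alpha")
        elif subunit.startswith("β") and len(subunit) > 1:
            kinds.append("beta")
        else:
            raise ValueError(f"unrecognized subunit {subunit!r}")
    return {
        "upstream": subunits[0],
        "downstream": subunits[1],
        "ctp_units": parts.count("CTP"),
        "pairing": "-".join(kinds),
    }


print(describe_construct("βFSH-CTP-CTP-α"))
print(describe_construct("βCG-βFSH")["pairing"])  # beta-beta
```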

Preferred Embodiments of CTP Units

The notation used for the CTP units of the invention
is as follows: for portions of the complete CTP unit, the
positions included in the portion are designated by their number
as they appear in Figure 2 herein. Where substitutions occur,
the substituted amino acid is provided along with a superscript
indicating its position. Thus, for example, CTP (120-143)
represents that portion of CTP extending from positions 120 to
143; CTP (120-130; 136-143) represents a fused amino acid
sequence lacking positions 118-119, 131-135, and 144-145 of the
native sequence. CTP (Arg¹²²) refers to a variant wherein the
lysine at position 122 is substituted by an arginine; CTP
(Ile¹³⁴) refers to a variant wherein the leucine at position 134
is substituted by isoleucine. CTP (Val¹²⁸Val¹⁴²) represents a
variant wherein two substitutions have been made, one for the
leucine at position 128 and the other for the isoleucine at
position 142. CTP (120-143; Ile¹²⁸ Ala¹³⁰) represents the
relevant portion of the CTP unit where the two indicated
substitutions have been made.
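
The range part of this notation can be worked through in code. The sketch below parses a range string and reports which positions of the native unit are absent. It assumes the complete CTP unit spans positions 118 to 145, which is inferred from the example in the preceding paragraph rather than stated outright; the function names are hypothetical, and the substitution superscripts are not parsed.

```python
def parse_segments(notation: str) -> list[tuple[int, int]]:
    """Parse a range string such as '120-130; 136-143' into (start, end) pairs."""
    segments = []
    for part in notation.split(";"):
        start, end = part.strip().split("-")
        segments.append((int(start), int(end)))
    return segments


def missing_positions(segments, first=118, last=145):
    """Return the runs of native positions not covered by `segments`.

    `first` and `last` bound the complete CTP unit (assumed 118-145).
    """
    kept = set()
    for start, end in segments:
        kept.update(range(start, end + 1))
    runs = []
    for pos in range(first, last + 1):
        if pos in kept:
            continue
        if runs and pos == runs[-1][1] + 1:
            runs[-1][1] = pos          # extend the current run
        else:
            runs.append([pos, pos])    # start a new run
    return [tuple(run) for run in runs]


print(missing_positions(parse_segments("120-130; 136-143")))
# [(118, 119), (131, 135), (144, 145)]
```

The printed result reproduces the positions that the text gives as lacking from CTP (120-130; 136-143).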

Also preferred among variants of CTP are those wherein
one or more of the O-linked glycosylation sites have been
altered or deleted. One particularly preferred means of
altering the site to prevent glycosylation is substitution of an
alanine residue for the serine residue in these sites.

Particularly preferred are those CTP units of the
following formulas:

Preferred Embodiments of the α and β Subunits

Of course, the native forms of the α and β subunits in
the single-chain form are among the preferred embodiments.
However, certain variants are also preferred.

In particular, variants of the α subunit in which the
N-linked glycosylation site at position 52 is eliminated or
altered by amino acid substitutions at or proximal to this site
are preferred for antagonist activity. Similar modifications at
the glycosylation site at position 78 are also preferred.
Deletion of one or more amino acids at positions 85-92 also
affects the nature of the activity of hormones containing the α
subunit and substitution or deletion of amino acids at these
positions is also among the preferred embodiments.

Similarly, the N-linked glycosylation sites in the β
chain can conveniently be modified to eliminate glycosylation
and thus affect the agonist or antagonist activity of the β
chains. If CTP is present, either natively as in CG or by
virtue of being present as a linker, the O-linked glycosylation
sites in this moiety may also be altered.

Particular variants containing modified or deleted
glycosylation sites are set forth in Yoo, J. et al. J Biol Chem
(1993) 268:13034-13042; Yoo, J. et al. J Biol Chem (1991)
266:17741-17743; and Bielinska, M. et al. J Cell Biol (1990)
111:330a (all cited above) and in Matzuk, M.M. et al. J Biol
Chem (1989) 264:2409-2414; Keene, J.L. et al. J Biol Chem (1989)
264:4769-4775; and Keene, J.L. et al. Mol Endocrinol (1989)
3:2011-2017.

Not only may the glycosylation sites per se be
modified directly, but positions proximal to these sites are
preferentially modified so that the glycosylation status of the
mutant will be affected. For the α subunit, for example,
variants in which amino acids between positions 50-60 are
substituted, including both conservative and nonconservative
substitutions, are favored, especially substitutions at
positions 51, 53 and 55 because of their proximity to the
glycosylation site at Asn52.

Also preferred are mutants of the α subunit wherein
lysine at position 91 is converted to methionine or glutamic
acid.

Although the variants have been discussed in terms of
variations in the individual subunits hereinabove, it will be
recalled that the single-chain forms of the dimer offer
additional opportunities for modification. Specifically,
regions that are critical to folding of the dimer may not be
critical to the correct conformation of the single-chain
molecule and these regions are available for variation in the
single-chain form, although not described above in terms of
individual members of the dimeric forms. Further, the
single-chain forms may be modified dramatically in the context of
non-critical regions whose alteration and/or deletion do not
affect the biological activity as described above.

Suitable Drugs

Suitable drugs that may be included in the linker
moiety include peptides or proteins such as insulin-like growth
factors; epidermal growth factors; acidic and basic fibroblast
growth factors; platelet-derived growth factors; the various
colony stimulating factors, such as granulocyte CSF,
macrophage-CSF, and the like; as well as the various cytokines
such as IL-2, IL-3 and the plethora of additional interleukin
proteins; the various interferons; tumor necrosis factor; and
the like. Peptide- or protein-based drugs have the advantage
that they can be included in the single-chain form and the entire
construct can readily be produced by recombinant expression of a
single gene. Also, small molecule drugs such as antibiotics,
antiinflammatories, toxins, and the like can be used.



WO 96/05224 PCT/US95/09664


In general, the drugs included within the linker
moiety will be those desired to act in the proximity of the
receptors to which the hormones ordinarily bind. Suitable
provision for release of the drug from inclusion within the
linker will be provided, for example, by also including sites
for enzyme-catalyzed lysis as further described under the
section headed Preparation Methods hereinbelow.



Other Modifications 

The single-chain proteins of the invention may be
further conjugated or derivatized in ways generally understood
to derivatize amino acid sequences, such as phosphorylation,
glycosylation, deglycosylation of ordinarily glycosylated forms,
modification of the amino acid side chains (e.g., conversion of
proline to hydroxyproline) and similar modifications analogous
to those post-translational events which have been found to
occur generally.

The glycosylation status of the hormones of the
invention is particularly important. The hormones may be
prepared in nonglycosylated form either by producing them in
procaryotic hosts or by mutating the glycosylation sites
normally present in the subunits and/or any CTP units that may
be present. Both nonglycosylated versions and partially
glycosylated versions of the hormones can be prepared by
manipulating the glycosylation sites. Normally glycosylated
versions are, of course, also included within the scope of the
invention.

As is generally known in the art, the single-chain
proteins of the invention may also be coupled to labels,
carriers, solid supports, and the like, depending on the desired
application. The labeled forms may be used to track their
metabolic fate; suitable labels for this purpose include,
especially, radioisotope labels such as iodine 131, technetium
99, indium 111, and the like. The labels may also be used to
mediate detection of the single-chain proteins in assay systems;
in this instance, radioisotopes may also be used as well as
enzyme labels, fluorescent labels, chromogenic labels, and the
like. The use of such labels is particularly helpful for these
proteins since they are targeting agents, i.e., receptor ligands.

The proteins of the invention may also be coupled to
carriers to enhance their immunogenicity in the preparation of
antibodies specifically immunoreactive with these new modified
forms. Suitable carriers for this purpose include keyhole
limpet hemocyanin (KLH), bovine serum albumin (BSA) and
diphtheria toxoid, and the like. Standard coupling techniques
for linking the modified peptides of the invention to carriers,
including the use of bifunctional linkers, can be employed.

Similar linking techniques, along with others, may be
employed to couple the proteins of the invention to solid
supports. When coupled, these proteins can then be used as
affinity reagents for the separation of desired components with
which specific reaction is exhibited.

Preparation Methods 

Methods to construct the proteins of the invention are
well known in the art. As set forth above, if only gene-encoded
amino acids are included, and the single-chain form is in a
head-to-tail configuration, the most practical approach at
present is to synthesize these materials recombinantly by
expression of the DNA encoding the desired protein. DNA
containing the nucleotide sequence encoding the single-chain
forms, including variants, can be prepared from native
sequences. Techniques for site-directed mutagenesis, ligation
of additional sequences, PCR, and construction of suitable
expression systems are all, by now, well known in the art.
Portions or all of the DNA encoding the desired protein can be
constructed synthetically using standard solid phase techniques,
preferably to include restriction sites for ease of ligation.
Suitable control elements for transcription and translation of
the included coding sequence can be provided to the DNA coding
sequences. As is well known, expression systems are now
available compatible with a wide variety of hosts, including
procaryotic hosts such as bacteria and eucaryotic hosts such as
yeast, plant cells, insect cells, mammalian cells, avian cells,
and the like.

The choice of host is particularly pertinent to
posttranslational events, most particularly including
glycosylation. The location of glycosylation is mostly
controlled by the nature of the glycosylation sites within the
molecule; however, the nature of the sugars occupying this site
is largely controlled by the nature of the host. Accordingly, a
fine-tuning of the properties of the hormones of the invention
can be achieved by proper choice of host.

A particularly preferred form of gene for the α
subunit portion, whether the α subunit is modified or
unmodified, is the "minigene" construction.

As used herein, the α subunit "minigene" refers to the
gene construction disclosed in Matzuk, M.M., et al., Mol
Endocrinol (1988) 2:95-100, in the description of the
construction of pM²/CG α or pM²/α. This "minigene" is
characterized by retention only of the intron sequence between
exon 3 and exon 4, all upstream introns having been deleted. In
the particular construction described, the N-terminal coding
sequences which are derived from exon 2 and a portion of exon 3
are supplied from cDNA and are ligated directly through an XbaI
restriction site into the coding sequence of exon 3 so that the
introns between exons I and II and between exons II and III are
absent. However, the intron between exons III and IV as well as
the signals 3' of the coding sequence are retained. The
resulting minigene can conveniently be inserted as a BamHI/BglII
segment. Other means for construction of a comparable minigene
are, of course, possible and the definition is not restricted to
the particular construction wherein the coding sequences are
ligated through an XbaI site. However, this is a convenient
means for the construction of the gene, and there is no
particular advantage to other approaches, such as synthetic or
partially synthetic preparation of the gene. The definition
includes those coding sequences for the α subunit which retain
the intron between exons III and IV, or any other intron and
preferably no other introns.

For recombinant production, modified host cells using 
expression systems are used and cultured to produce the desired 
protein. These terms are used herein as follows: 

A "modified" recombinant host cell, i.e., a cell
"modified to contain" the recombinant expression systems of
the invention, refers to a host cell which has been altered to
contain this expression system by any convenient manner of
introducing it, including transfection, viral infection, and so
forth. "Modified" refers to cells containing this expression
system whether the system is integrated into the chromosome or
is extrachromosomal. The "modified" cells may either be stable
with respect to inclusion of the expression system or not. In
short, "modified" recombinant host cells with the expression
system of the invention refers to cells which include this
expression system as a result of their manipulation to include
it, when they natively do not, regardless of the manner of
effecting this incorporation.

"Expression system" refers to a DNA molecule which
includes a coding nucleotide sequence to be expressed and those
accompanying control sequences necessary to effect the
expression of the coding sequence. Typically, these controls
include a promoter, termination regulating sequences, and, in
some cases, an operator or other mechanism to regulate
expression. The control sequences are those which are designed
to be functional in a particular target recombinant host cell
and therefore the host cell must be chosen so as to be
compatible with the control sequences in the constructed
expression system.

If secretion of the protein produced is desired,
additional nucleotide sequences encoding a signal peptide are
also included so as to produce the signal peptide operably
linked to the desired single-chain hormone to produce the
preprotein. Upon secretion, the signal peptide is cleaved to
release the mature single-chain hormone.

As used herein "cells," "cell cultures," and "cell
lines" are used interchangeably without particular attention to
nuances of meaning. Where the distinction between them is
important, it will be clear from the context. Where any can be
meant, all are intended to be included.

The protein produced may be recovered from the lysate
of the cells if produced intracellularly, or from the medium if
secreted. Techniques for recovering recombinant proteins from
cell cultures are well understood in the art, and these proteins
can be purified using known techniques such as chromatography,
gel electrophoresis, selective precipitation, and the like.

All or a portion of the hormones of the invention may
be synthesized directly using peptide synthesis techniques known
in the art. Synthesized portions may be ligated, and release
sites for any drug contained in the linker moiety introduced by
standard chemical means. For those embodiments which contain
amino acids which are not encoded by the gene and those
embodiments wherein the head-to-head or tail-to-tail
configuration is employed, of course, the synthesis must be at
least partly at the protein level. Head-to-head junctions at
the natural N-termini or at positions proximal to the natural
N-termini may be effected through linkers which contain
functional groups reactive with amino groups, such as
dicarboxylic acid derivatives. Tail-to-tail configurations at
the C-termini or positions proximal to the C-termini may be
effected through linkers which are diamines, diols, or
combinations thereof.

Antibodies

The proteins of the invention may be used to generate
antibodies specifically immunoreactive with these new compounds.
These antibodies are useful in a variety of diagnostic and
therapeutic applications. For example, when the single-chain
forms of the invention are used therapeutically in either human
or veterinary contexts, the levels of drug may be monitored
with these antibodies using conventional immunoassay
techniques. In addition, since some of the antibodies raised by
these single-chain forms are cross-reactive with the
heterodimer, they can be used to diagnose naturally occurring
levels of the heterodimer.

The antibodies are generally prepared using standard
immunization protocols in mammals such as rabbits, mice, sheep
or rats, and the antibodies are titered as polyclonal antisera
to assure adequate immunization. The polyclonal antisera can
then be harvested as such for use in, for example, immunoassays.
Antibody-secreting cells from the host, such as spleen cells, or
peripheral blood leukocytes, may be immortalized using known
techniques and screened for production of monoclonal antibodies
immunospecific with the proteins of the invention.

By "immunospecific for the proteins" is meant
antibodies which are immunoreactive with the single-chain
proteins, but not with the heterodimers per se within the
general parameters considered to determine affinity or
nonaffinity. It is understood that specificity is a relative
term, and an arbitrary limit could be chosen, such as a
difference in immunoreactivity of 100-fold or greater. Thus, an
immunospecific antibody included within the invention is at
least 100 times more reactive with the single-chain protein than
with the corresponding heterodimers.
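
The 100-fold criterion is a simple ratio test. The sketch below applies it to a pair of reactivity measurements. It assumes both values come from the same assay on the same scale; the function name, the parameter names and the handling of a zero heterodimer signal are illustrative choices, and the text itself notes that the limit is arbitrary.

```python
def is_immunospecific(reactivity_single_chain: float,
                      reactivity_heterodimer: float,
                      min_fold: float = 100.0) -> bool:
    """Apply the fold-difference criterion to two reactivity measurements.

    Both values must be non-negative and expressed in the same units.
    """
    if reactivity_single_chain < 0 or reactivity_heterodimer < 0:
        raise ValueError("reactivities must be non-negative")
    if reactivity_heterodimer == 0:
        # No detectable heterodimer signal: any positive signal qualifies.
        return reactivity_single_chain > 0
    return reactivity_single_chain / reactivity_heterodimer >= min_fold


print(is_immunospecific(500.0, 4.0))   # True: 125-fold
print(is_immunospecific(500.0, 10.0))  # False: 50-fold
```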

Formulation 

The proteins of the invention are formulated and
administered using methods comparable to those known for the
heterodimers corresponding to the single-chain form. Thus,
formulation and administration methods will vary according to
the particular hormone used. However, the dosage level and
frequency of administration may be altered as compared to the
heterodimer, especially if CTP units are present, in view of the
extended biological half-life due to their presence.

Formulations for proteins of the invention are those
typical of protein or peptide drugs such as found in Remington's
Pharmaceutical Sciences, latest edition, Mack Publishing
Company, Easton, PA. Generally, proteins are administered by
injection, typically intravenous, intramuscular, subcutaneous,
or intraperitoneal injection, or using formulations for
transmucosal or transdermal delivery. These formulations
generally include a detergent or penetrant such as bile salts,
fusidic acids, and the like. These formulations can be
administered as aerosols or suppositories or, in the case of
transdermal administration, in the form of skin patches.

Oral administration is also possible provided the
formulation protects the peptides of the invention from
degradation in the digestive system.

Optimization of dosage regimen and formulation is 
conducted as a routine matter and as generally performed in the 
art. 

Methods of Use 

The single-chain peptides of the invention may be used
in many ways, most evidently as substitutes for the
heterodimeric forms of the hormones. Thus, like the
heterodimers, the agonist forms of the single-chain hormones of
the invention can be used in treatment of infertility, as aids
in in vitro fertilization techniques, and other therapeutic
methods associated with the native hormones, both in humans and
in animals.

The single-chain hormones are also useful as reagents
in a manner similar to the heterodimers.

In addition, the single- chain hormones of the 
invention may be used as diagnostic tools to detect the presence 
or absence of antibodies with respect to the native proteins in 
biological samples. They are also useful as control reagents in 
assay kits for assessing the levels of these hormones in various 
samples. Protocols for assessing levels of the hormones 
themselves or of antibodies raised against them are standard 
immunoassay protocols commonly known in the art. Various 
competitive and direct assay methods can be used involving a 
variety of labeling techniques including radio- isotope labeling, 
fluorescence laibeling, enzyme labeling and the like. 

The single-chain hormones of the invention are also
useful in detecting and purifying receptors to which the native
hormones bind. Thus, the single-chain hormones of the invention
may be coupled to solid supports and used in affinity
chromatographic preparation of receptors or antihormone
antibodies. The resulting receptors are themselves useful in
assessing hormone activity for candidate drugs in screening
tests for therapeutic and reagent candidates.

Finally, the antibodies uniquely reactive with the
single-chain hormones of the invention can be used as
purification tools for isolation of subsequent preparations of
these materials. They can also be used to monitor levels of the
single-chain hormones administered as drugs.

The following examples are intended to illustrate but
not to limit the invention.

Example 1

Preparation of DNA Encoding CGβ-α
Figure 1 shows the construction of an insert for an
expression vector wherein the C-terminus of the β chain of human
CG is linked to the N-terminus of the mature human α subunit.

As shown in Figure 1, the polymerase chain reaction
(PCR) is utilized to fuse the two subunits between exon 3 of CGβ
and exon 2 of the α subunit so that the codon for the carboxy
terminal amino acid of CGβ is fused directly in reading frame to
that of the N-terminal amino acid of the α subunit. This is
accomplished by using a hybrid primer to amplify a fragment
containing exon 3 of CGβ wherein the hybrid primer contains a
"tail" encoding the N-terminal sequence of the α subunit. The
resulting amplified fragment thus contains a portion of exon 2
encoding human CGα.

Independently, a hybrid primer encoding the N-terminal
sequence of the α subunit fused to the codons corresponding to
the C-terminus of CGβ is used as one of the primers to amplify
the α minigene. The two amplified fragments, each now
containing overlapping portions encoding the other subunit, are
together amplified with two additional primers covering the
entire span to obtain the SalI insert.

In more detail, reaction I shows the production of a
fragment containing exon 3 of CGβ and the first four amino acids
of the mature α subunit as well as a SalI site 5'-ward of the
coding sequences. It is obtained by amplifying a portion of the
CGβ genomic sequence which is described by Matzuk, M.M. et al.
Proc Natl Acad Sci USA (1987) 84:6354-6358; Policastro, P. et
al. J Biol Chem (1983) 258:11492-11499.
Primer 1 provides the SalI site and has the sequence:

5'-GGA GGA AGG GTG GTC GAC CTC TCT GGT-3'
                   SalI



The other primer, primer 2, is complementary to four
codons of the α N-terminal sequence and five codons of the CGβ
C-terminal sequence and has the sequence:

5'-CAC ATC AGG AGC TTG TGG GAG GAT CGG-3'



The resultant amplified segment, which is the product
of reaction I, thus has a SalI site 5'-ward of the fused coding
region.

In reaction II, an analogous fused coding region is
obtained from the α minigene described hereinabove. Primer 3 is
a hybrid primer containing four codons of the β subunit and five
codons of α and has the sequence:

5'-ATC CTC CCA CAA GCT CCT GAT GTG CAG-3'
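
The codon counts stated for primers 2 and 3 can be checked by translating the printed sequences. The following is a minimal sketch and not part of the original disclosure: it assumes the standard genetic code and the primer sequences exactly as printed in the text, and the helper names `reverse_complement` and `translate` are illustrative. Primer 2 is read as its reverse complement because the text describes it as complementary to the coding sequence.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate a DNA sequence in frame, starting at its first base."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Primer sequences as printed in the text, with spaces removed.
PRIMER_2 = "CACATCAGGAGCTTGTGGGAGGATCGG"  # antisense primer of reaction I
PRIMER_3 = "ATCCTCCCACAAGCTCCTGATGTGCAG"  # sense primer of reaction II

junction_from_primer_2 = translate(reverse_complement(PRIMER_2))
junction_from_primer_3 = translate(PRIMER_3)

print(junction_from_primer_2)  # PILPQAPDV: five codons, then four codons
print(junction_from_primer_3)  # ILPQAPDVQ: four codons, then five codons
```

Under these assumptions both primers read through the same junction, Pro-Ile-Leu-Pro-Gln followed directly by Ala-Pro-Asp-Val, which agrees with the stated split of five plus four codons for primer 2 and four plus five codons for primer 3.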



Primer 4 contains a SalI site and is complementary to
the extension of α exon 4. Primer 4 has the sequence:

5'-TGA GTC GAC ATG ATA ATT CAG TGA TTG AAT-3'
       SalI
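
The position of the SalI site in the two outer primers can be confirmed with a simple string search. This is an illustrative sketch only: it assumes the SalI recognition sequence GTCGAC and the primer 1 and primer 4 sequences as printed in the text, and the function name `find_site` is hypothetical.

```python
SALI_SITE = "GTCGAC"  # SalI recognition sequence

# Outer primers as printed in the text, with spaces removed.
PRIMER_1 = "GGAGGAAGGGTGGTCGACCTCTCTGGT"
PRIMER_4 = "TGAGTCGACATGATAATTCAGTGATTGAAT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_site(seq: str, site: str = SALI_SITE) -> list:
    """Return the 0-based start index of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# The site is palindromic, so it reads the same on both strands.
print(reverse_complement(SALI_SITE) == SALI_SITE)  # True
print(find_site(PRIMER_1))  # [12]
print(find_site(PRIMER_4))  # [3]
```

Each outer primer carries exactly one copy of the site, so the product amplified with primers 1 and 4 would be flanked by SalI sites at both ends, as required for insertion into a SalI-cut vector.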



Thus, the products of reactions I and II overlap, and
when subjected to PCR in the presence of primers 1 and 4 yield
the desired SalI product as shown in reaction III.
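
The overlap that lets the two products prime each other in reaction III can be estimated from the hybrid primers alone. The sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes that the sense strand of the reaction I product ends with the reverse complement of primer 2 and that the sense strand of the reaction II product begins with primer 3. The function name `longest_overlap` is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_overlap(left: str, right: str) -> int:
    """Length of the longest suffix of left that is also a prefix of right."""
    for n in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:n]):
            return n
    return 0


# Hybrid primers as printed in the text, with spaces removed.
PRIMER_2 = "CACATCAGGAGCTTGTGGGAGGATCGG"
PRIMER_3 = "ATCCTCCCACAAGCTCCTGATGTGCAG"

# Assumed sense-strand ends of the two intermediate products.
product_I_end = reverse_complement(PRIMER_2)
product_II_start = PRIMER_3

overlap = longest_overlap(product_I_end, product_II_start)
shared = product_II_start[:overlap]

print(overlap)  # 24
print(shared)   # ATCCTCCCACAAGCTCCTGATGTG
```

On these assumptions the two products share 24 nucleotides spanning the fusion junction, which is the region through which they would anneal before extension and amplification with primers 1 and 4.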

The amplified fragment containing CGβ exon 3 and the α
minigene is inserted into the SalI site of pM²HA-CGβexon1,2, an
expression vector which is derived from pM² containing CGβ exons
1 and 2 in the manner described by Sachais, B., Snider, R.M.,
Lowe, J., Krause, J. J Biol Chem (1993) 268:2319. pM²
containing CGβ exons 1 and 2 is described in Matzuk, M.M. et al.
Proc Natl Acad Sci USA (1987) 84:6354-6358 and Matzuk, M.M. et al.
J Cell Biol (1988) 106:1049-1059.

This expression vector then will produce the single-chain
form of human CG wherein the C-terminus of the β subunit is
directly linked to the N-terminus of the α subunit.



Example 2

Production and Activity of the Single-Chain Human CG
The expression vector constructed in Example 1 was
transfected into Chinese hamster ovary (CHO) cells and
production of the protein was assessed by immunoprecipitation of
radiolabeled protein followed by analysis on SDS gels. The
culture medium was collected and the bioactivity of the
single-chain protein was compared to that of the heterodimer in
a competitive binding assay with respect to the human LH
receptor. In this assay, the cDNA
encoding the entire human LH receptor was inserted into the
expression vector pCMX (Oikawa, J. X-C et al. Mol Endocrinol
(1991) 5:759-768). Exponentially growing 293 cells were
transfected with this vector using the method of Chen, C. et al.
Mol Cell Biol (1987) 7:2745-2752.

In the assay, the cells expressing human LH receptor
(2 x 10^[illegible]/tube) were incubated with 1 ng of labeled hCG in
competition with the sample to be tested at 22°C for 18 hours.
The samples were then diluted 5-fold with cold Dulbecco's PBS (2
ml) supplemented with 0.1% BSA and centrifuged at 800 x g for 15
minutes. The pellets were washed twice with Dulbecco's PBS and
radioactivity was determined with a gamma counter. Specific
binding was 10-12% of the total labeled (iodinated) hCG added in
the absence of sample. The decrease in bound label in the
presence of sample measures the binding ability of the sample.
In this assay, with respect to the human LH receptor in 293
cells, the wild-type hCG had an ED50 of 0.47 ng and the
single-chain protein had an ED50 of 1.1 ng.
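
One common way to read an ED50 from a competition curve of this kind is to interpolate in log dose between the two points that bracket 50% of control binding. The sketch below illustrates that idea only. The text does not state how the ED50 values were derived, the function name `ed50_log_interpolation` is hypothetical, and the displacement data are invented for illustration and are NOT taken from the text.

```python
import math


def ed50_log_interpolation(doses, percent_bound):
    """Estimate the dose at which binding falls to 50% of control.

    doses: competitor doses in increasing order (one unit throughout).
    percent_bound: labeled hormone bound at each dose, as a percentage
        of binding without competitor; assumed to fall as dose rises.
    Interpolates linearly in log10(dose) between the bracketing points.
    """
    points = list(zip(doses, percent_bound))
    for (d_lo, b_lo), (d_hi, b_hi) in zip(points, points[1:]):
        if b_lo >= 50.0 >= b_hi and b_lo != b_hi:
            frac = (b_lo - 50.0) / (b_lo - b_hi)
            log_lo, log_hi = math.log10(d_lo), math.log10(d_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("curve does not cross 50% of control binding")


# Hypothetical displacement curve (illustrative values only).
doses_ng = [0.1, 0.3, 1.0, 3.0, 10.0]
bound_pct = [90.0, 70.0, 40.0, 20.0, 8.0]

ed50 = ed50_log_interpolation(doses_ng, bound_pct)
print(f"estimated ED50: {ed50:.2f} ng")
```

A fitted four-parameter logistic curve is the more usual choice when enough dose points are available; interpolation is shown here only because it needs no external libraries.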

In an additional assay for agonist activity,
stimulation of cAMP production was assessed. In this case, 293
cells expressing human LH receptors (2 x 10^[illegible]/tube) were
incubated with varying concentrations of the heterodimeric hCG
or single-chain hCG and cultured for 18 hours. The
extracellular cAMP levels were determined by specific
radioimmunoassay as described by Davoren, J.B. et al. Biol
Reprod (1985) 33:37-52. In this assay, the wild-type had an ED50
of 0.6 ng/ml and the single-chain form had an ED50 of 1.7 ng/ml.
(ED50 is the dose that produces 50% of the maximal effect.)

Thus, in all cases, the behavior of both the wild-type
and single-chain forms is similar.
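
The degree of similarity can be made concrete by expressing each single-chain ED50 as a multiple of the corresponding wild-type ED50. The short sketch below uses only the four ED50 values reported in this example; the dictionary layout and the function name `fold_difference` are illustrative choices.

```python
# ED50 values as reported in Example 2.
reported_ed50 = {
    "LH receptor binding (ng)": {"wild_type": 0.47, "single_chain": 1.1},
    "cAMP stimulation (ng/ml)": {"wild_type": 0.6, "single_chain": 1.7},
}


def fold_difference(wild_type: float, single_chain: float) -> float:
    """Return the single-chain ED50 as a multiple of the wild-type ED50."""
    return single_chain / wild_type


folds = {assay: fold_difference(**values) for assay, values in reported_ed50.items()}

for assay, fold in folds.items():
    print(f"{assay}: single-chain ED50 is {fold:.1f}-fold the wild-type ED50")
```

By this measure the single-chain form lies within about threefold of the heterodimer in both the binding assay and the cAMP assay.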

Example 3

Additional Activity Assays
The medium from CHO cells transfected with an
expression vector for the βFSH-CTP-α single-chain construct was
recovered and assayed as described in Example 2. The results of
the competition assay for binding to FSH receptor are shown in
Figure 3. The results indicate that the single-chain form is
more effective than either wild-type FSH or FSH containing a CTP
extension at the β chain in inhibiting binding of FSH itself to
the receptor. The ED50 for the single-chain form is
approximately 50 mIU/ml while the ED50 for the extended
heterodimer is somewhat over 100 mIU/ml. That for wild-type FSH
is about 120 mIU/ml.

The results of the signal transduction assay are shown
in Figure 4. The effectiveness of all three types of FSH is
comparable.

Example 4

Construction of Additional Expression Vectors
In a manner similar to that set forth in Example 1,
expression vectors for the production of single-chain FSH,
TSH and LH (βFSH-α, βFSH-CTP-α, βTSH-α, βTSH-CTP-α, βLH-α,
βLH-CTP-α) are prepared and transfected into CHO cells. The
resulting hormones show activities similar to those of the
wild-type form when assayed as set forth in Example 2.

Similarly, expression vectors for the "two-β" single-chain
forms are constructed in a manner analogous to that set
forth in Example 1 and expressed and assayed as described in
Example 2.

1. A glycosylated or nonglycosylated protein which
comprises:

the amino acid sequence of the α subunit common to the
glycoprotein hormones linked covalently, optionally through a
linker moiety, to the amino acid sequence of the β subunit of
one of said hormones,

wherein said α and β subunits consist of the native
amino acid sequences or variants of said amino acid sequences.

2. The protein of claim 1 wherein said protein
includes said linker moiety, and said linker moiety optionally
includes a drug to be targeted to the receptor for the
glycoprotein hormone.

3. The protein of claim 1 wherein a position
proximal to the C-terminus of the β subunit is linked
covalently, optionally through a linker moiety, to a position
proximal to the N-terminus of the α subunit, or wherein a
position proximal to the C-terminus of the α subunit is linked
covalently, optionally through a linker moiety, to a position
proximal to the N-terminus of the β subunit.

4. The protein of claim 1 wherein the β subunit is
the β subunit of human chorionic gonadotropin or variant
thereof; or

wherein the β subunit is the β subunit of FSH or
variant thereof; or

wherein the β subunit is the β subunit of FSH extended
at a position proximal to its C-terminus by a complete or
partial CTP unit or variant thereof; or

wherein the β subunit is the β subunit of LH or
variant thereof; or

wherein the β subunit is the β subunit of LH extended
at a position proximal to its C-terminus by a complete or
partial CTP unit or variant thereof; or

wherein the β subunit is the β subunit of TSH or
variant thereof; or

wherein the β subunit is the β subunit of TSH extended
at a position proximal to its C-terminus by a complete or
partial CTP unit or variant thereof; and/or

wherein the α subunit is extended at a position
proximal to its N-terminus by a complete or partial CTP unit or
variant thereof.

5. The protein of claim 1 wherein the α subunit or β
subunit or both are modified by the insertion of a complete or
partial CTP unit or variant thereof into a noncritical region
thereof and/or wherein said linker moiety includes a complete or
partial CTP unit or variant thereof.



6. The protein of claim 5 wherein said partial CTP
unit consists of positions 112-132; 115-132; 116-132; or
118-132; or 112-127; 115-127; 116-127; or 118-127; and/or

wherein said CTP has one or more O-linked
glycosylation sites modified or deleted; and/or

wherein said noncritical region is proximal to the
C-terminus; or

wherein said noncritical region is proximal to the
N-terminus.

7. The protein of claim 1 wherein the β subunit
contains a modification in one or more N-linked glycosylation
sites; and/or

wherein one or both of the N-linked glycosylation
sites of the α subunit have been modified; and/or

which is nonglycosylated; and/or

wherein one or more amino acids at positions 85-92 of
the α subunit have been deleted.

8. A glycosylated or nonglycosylated protein which
comprises:

the amino acid sequence of the β subunit of a
glycoprotein hormone linked covalently, optionally through a
linker moiety, to the amino acid sequence of the β subunit of
the same or different glycoprotein hormone,

wherein said β subunits consist of the native amino
acid sequences of said hormones or variants of said amino acid
sequences, or

the amino acid sequence of the α subunit of a
glycoprotein hormone linked covalently, optionally through a
linker moiety, to the amino acid sequence of the α subunit of
the same or different glycoprotein hormone,

wherein said α subunits consist of the native amino
acid sequences of said hormones or variants of said amino acid
sequences.

9. The protein of claim 8 wherein a position
proximal to the C-terminus of one α or β subunit is linked
covalently, optionally through a linker moiety, to a position
proximal to the N-terminus of the other α or β subunit.

10. The protein of claim 8 wherein said protein
includes a linker moiety.

11. A pharmaceutical or veterinary composition which
comprises the protein of claim 1 or 8 in admixture with a
suitable pharmaceutical excipient.

12. Antibodies immunospecific for the protein of
claim 1 or 8.

13. A DNA or RNA molecule which comprises a
nucleotide sequence encoding the protein of claim 1 or 8.

14. An expression system for production of a single-chain
form of a glycoprotein hormone which expression system
comprises a first nucleotide sequence encoding the protein of
claim 1 or 8 operably linked to control sequences capable of
effecting the expression of said first nucleotide sequence.

15. The expression system of claim 14 which further
contains a second nucleotide sequence encoding a signal peptide
operably linked to the protein encoded by said first nucleotide
sequence.

16. A host cell modified to contain the expression
system of claim 15.

17. A method to produce a single-chain form of a
glycoprotein hormone

which method comprises culturing the cells of claim 16
under conditions wherein said glycoprotein hormone is produced;
and

recovering the glycoprotein hormone from the culture.